The Philips/RWTH system for transcription of broadcast news
Authors
Abstract
This paper describes the Philips/RWTH 1998 HUB4 system, which was built in a joint effort of Philips Research Laboratories Aachen and Aachen University of Technology. We focus on recent improvements over the original 1997 HUB4 system and evaluate them on the HUB4'97 evaluation data. The paper covers (1) a rough system overview including feature extraction, acoustic training, audio stream segmentation, and decoding, (2) log-linear interpolation of distance-language models, and (3) the integration of various acoustic and language models via Discriminative Model Combination (DMC). The described system performs 23% (relative) better than the 1997 Philips HUB4 system: it achieved a word error rate of 17.9% on the 1997 HUB4 evaluation set, compared to 23.5% for the original 1997 system.
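As a rough illustration of two ideas in the abstract, the sketch below shows a generic log-linear interpolation of language model probabilities for one fixed history, and the arithmetic behind the relative word error rate improvement. The function names, the toy vocabulary, the probabilities, and the weights are all hypothetical and are not taken from the paper; the paper's actual models and weight estimation are not reproduced here.

```python
import math


def log_linear_interpolate(distributions, weights):
    """Log-linearly combine word distributions for one fixed history.

    Computes P(w) proportional to the product of P_i(w) ** lambda_i and
    renormalizes over the vocabulary. All distributions must share the same
    vocabulary and assign a strictly positive probability to every word.
    """
    vocab = list(distributions[0].keys())
    scores = {}
    for word in vocab:
        # Weighted sum of log-probabilities is the log of the weighted product.
        log_score = sum(
            lam * math.log(dist[word])
            for dist, lam in zip(distributions, weights)
        )
        scores[word] = math.exp(log_score)
    normalizer = sum(scores.values())
    return {word: score / normalizer for word, score in scores.items()}


def relative_wer_reduction(baseline_wer, improved_wer):
    """Relative improvement of improved_wer over baseline_wer (as a fraction)."""
    return (baseline_wer - improved_wer) / baseline_wer


# Toy example: two hypothetical models over a three-word vocabulary.
model_a = {"news": 0.5, "weather": 0.3, "sports": 0.2}
model_b = {"news": 0.2, "weather": 0.6, "sports": 0.2}
combined = log_linear_interpolate([model_a, model_b], [0.7, 0.3])

# Word error rates quoted in the abstract: 23.5% (1997) and 17.9% (1998).
reduction = relative_wer_reduction(23.5, 17.9)
```

With the error rates quoted in the abstract, `reduction` is about 0.238, which matches the stated relative improvement of roughly 23%. Unlike linear interpolation, the log-linear combination needs an explicit normalization over the vocabulary for every history, which is why the sketch divides by `normalizer`.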
Similar resources
Automatic Transcription Verification of Broadcast News and Similar Speech Corpora
In the last few years, the focus in ASR research has shifted from the recognition of clean read speech (e.g. WSJ) to the more challenging task of transcribing found speech like broadcast news (Hub-4 task) and telephone conversations (Switchboard). Available training corpora tend to become larger and more erroneous than before, as transcribing found speech is more difficult. In this paper we pre...
Large vocabulary continuous speech recognition of Broadcast News - The Philips/RWTH approach
Automatic speech recognition of real-life broadcast news (BN) data (Hub-4) has become a challenging research topic in recent years. This paper summarizes our key efforts to build a large vocabulary continuous speech recognition system for the heterogeneous BN task without incurring undesired complexity or computational cost. These key efforts included: • automatic segmentation of the audio ...
Automatic Transcription of English Broadcast News
In this paper the Philips Broadcast News transcription system is described. The Broadcast News task aims at the recognition of "found" speech in radio and television broadcasts without any additional side information (e.g. speaking style, background conditions). The system was derived from the Philips continuous mixture density crossword HMM system, using MFCC features and Laplacian densities. ...
Automatic verification of broadcast news transcriptions
In this paper we present a method for automatically detecting erroneous training scripts for speech corpora like Broadcast News and Switchboard. Based on the Hub-4 task we will report on the performance of error detection with the proposed method and investigate the effects of both manually and automatically cleaned training corpora on the performance of the RWTH speech recognition system. Our ...
The need to create a media block for the convergence of overseas news networks
As a general diplomacy arm of the Islamic Republic of Iran, VoSiMa has extensive activities in international broadcasting of its radio and television programs. These programs are broadcast in different languages, such as English, French, Azeri, Arabic, and ... for regional and transnational audiences. The large volume of the organization's international activities is in the form of news and new...